
    Can ground truth label propagation from video help semantic segmentation?

    For state-of-the-art semantic segmentation, training convolutional neural networks (CNNs) requires dense pixelwise ground truth (GT) labeling, which is expensive and involves extensive human effort. In this work, we study the possibility of using auxiliary ground truth, so-called pseudo ground truth (PGT), to improve performance. The PGT is obtained by propagating the labels of a GT frame to its subsequent frames in the video using a simple CRF-based cue integration framework. Our main contribution is to demonstrate the use of noisy PGT along with GT to improve the performance of a CNN. We perform a systematic analysis to find the right kind of PGT that needs to be added along with the GT for training a CNN. In this regard, we explore three aspects of PGT which influence the learning of a CNN: i) the PGT labeling has to be of good quality; ii) the PGT images have to be different from the GT images; iii) the PGT has to be trusted differently than GT. We conclude that PGT which is diverse from GT images and has good labeling quality can indeed help improve the performance of a CNN. Also, when PGT is multiple folds larger than GT, weighting down the trust in PGT helps improve the accuracy. Finally, we show that by using PGT along with GT, the performance of a Fully Convolutional Network (FCN) on CamVid data is increased by 2.7% on IoU accuracy. We believe such an approach can be used to train CNNs for semantic video segmentation where sequentially labeled image frames are needed. To this end, we provide recommendations for using PGT strategically for semantic segmentation and hence bypass the need for extensive human effort in labeling. Comment: To appear at ECCV 2016 Workshop on Video Segmentation
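
The third aspect above (trusting PGT differently than GT) can be illustrated with a per-sample loss weight. The abstract does not say how the trust is reduced, so the following is only a minimal sketch under that assumption; the names `cross_entropy`, `weighted_loss`, and `pgt_weight` are hypothetical and not taken from the paper.

```python
import math


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])


def weighted_loss(gt_batch, pgt_batch, pgt_weight=0.5):
    """Weighted mean cross-entropy over GT and PGT samples.

    Each batch is a list of (class_probabilities, label) pairs.
    PGT terms are scaled by pgt_weight (an assumed trust factor),
    and the sum is normalised by the total weight.
    """
    gt_terms = [cross_entropy(p, y) for p, y in gt_batch]
    pgt_terms = [pgt_weight * cross_entropy(p, y) for p, y in pgt_batch]
    total_weight = len(gt_terms) + pgt_weight * len(pgt_terms)
    if total_weight == 0:
        raise ValueError("no weighted samples to average over")
    return (sum(gt_terms) + sum(pgt_terms)) / total_weight
```

Setting `pgt_weight=0.0` reduces the loss to plain GT training, while `pgt_weight=1.0` treats PGT and GT identically; values in between express partial trust in the propagated labels.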

    Deep Depth From Focus

    Depth from focus (DFF) is one of the classical ill-posed inverse problems in computer vision. Most approaches recover the depth at each pixel based on the focal setting which exhibits maximal sharpness. Yet, it is not obvious how to reliably estimate the sharpness level, particularly in low-textured areas. In this paper, we propose "Deep Depth From Focus (DDFF)" as the first end-to-end learning approach to this problem. One of the main challenges we face is the data hunger of deep neural networks. In order to obtain a significant amount of focal stacks with corresponding ground-truth depth, we propose to leverage a light-field camera with a co-calibrated RGB-D sensor. This allows us to digitally create focal stacks of varying sizes. Compared to existing benchmarks, our dataset is 25 times larger, enabling the use of machine learning for this inverse problem. We compare our results with state-of-the-art DFF methods and we also analyze the effect of several key deep architectural components. These experiments show that our proposed method "DDFFNet" achieves state-of-the-art performance in all scenes, reducing depth error by more than 75% compared to the classical DFF methods. Comment: accepted to Asian Conference on Computer Vision (ACCV) 201
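
The classical baseline mentioned above (pick, per pixel, the focal setting with maximal sharpness) can be sketched as follows. This is not DDFFNet; it is a minimal illustration that assumes a 4-neighbour Laplacian as the sharpness measure, and the function names and the `focus_depths` parameter are hypothetical.

```python
def laplacian_sharpness(img, y, x):
    """Absolute 4-neighbour Laplacian response at interior pixel (y, x)."""
    return abs(
        4 * img[y][x]
        - img[y - 1][x] - img[y + 1][x]
        - img[y][x - 1] - img[y][x + 1]
    )


def depth_from_focus(stack, focus_depths):
    """Per-pixel depth from a focal stack via maximal sharpness.

    stack: list of equally sized grayscale images (lists of rows),
           one per focal setting.
    focus_depths: depth associated with each focal setting.
    Border pixels have no full neighbourhood and are left as None.
    """
    h, w = len(stack[0]), len(stack[0][0])
    depth = [[None] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            scores = [laplacian_sharpness(img, y, x) for img in stack]
            best = max(range(len(stack)), key=scores.__getitem__)
            depth[y][x] = focus_depths[best]
    return depth
```

In a textureless region every slice yields a sharpness score near zero, so the argmax is essentially arbitrary. That is the failure mode in low-textured areas that the abstract points out.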

    The consequence of excess configurational entropy on fragility: the case of a polymer/oligomer blend

    By taking advantage of the molecular weight dependence of the glass transition of polymers and their ability to form perfectly miscible blends, we propose a way to modify the fragility of a system, from fragile to strong, while keeping the same glass properties, i.e. vibrational density of states, mean-square displacement and local structure. Both slow and fast dynamics are investigated by calorimetry and neutron scattering in an athermal polystyrene/oligomer blend, and compared to those of a pure 17-mer polystyrene of the same Tg, taken as a reference. Whereas the blend and the pure 17-mer have the same heat capacity in the glass and in the liquid, their fragilities differ strongly. This difference in fragility is related to an extra configurational entropy created by the mixing process and acting at a scale much larger than the interchain distance, without affecting the fast dynamics and the structure of the glass.

    Joint Learning of Intrinsic Images and Semantic Segmentation

    Semantic segmentation of outdoor scenes is problematic when there are variations in imaging conditions. It is known that albedo (reflectance) is invariant to all kinds of illumination effects. Thus, using reflectance images for the semantic segmentation task can be favorable. Additionally, not only may segmentation benefit from reflectance, but segmentation may also be useful for reflectance computation. Therefore, in this paper, the tasks of semantic segmentation and intrinsic image decomposition are considered as a combined process by exploring their mutual relationship in a joint fashion. To that end, we propose a supervised end-to-end CNN architecture to jointly learn intrinsic image decomposition and semantic segmentation. We analyze the gains of addressing those two problems jointly. Moreover, new cascade CNN architectures for intrinsic-for-segmentation and segmentation-for-intrinsic are proposed as single tasks. Furthermore, a dataset of 35K synthetic images of natural environments is created with corresponding albedo and shading (intrinsics), as well as semantic labels (segmentation) assigned to each object/scene. The experiments show that joint learning of intrinsic image decomposition and semantic segmentation is beneficial for both tasks for natural scenes. Dataset and models are available at: https://ivi.fnwi.uva.nl/cv/intrinseg Comment: ECCV 201

    Classical and Quantum Chaos in a quantum dot in time-periodic magnetic fields

    We investigate the classical and quantum dynamics of an electron confined to a circular quantum dot in the presence of homogeneous $B_{dc}+B_{ac}$ magnetic fields. The classical motion shows a transition to chaotic behavior depending on the ratio $\epsilon=B_{ac}/B_{dc}$ of field magnitudes and the cyclotron frequency $\tilde\omega_c$ in units of the drive frequency. We determine a phase boundary between regular and chaotic classical behavior in the $\epsilon$ vs $\tilde\omega_c$ plane. In the quantum regime we evaluate the quasi-energy spectrum of the time-evolution operator. We show that the nearest neighbor quasi-energy eigenvalues show a transition from level clustering to level repulsion as one moves from the regular to the chaotic regime in the $(\epsilon,\tilde\omega_c)$ plane. The $\Delta_3$ statistic confirms this transition. In the chaotic regime, the eigenfunction statistics coincide with the Porter-Thomas prediction. Finally, we explicitly establish the phase space correspondence between the classical and quantum solutions via the Husimi phase space distributions of the model. Possible experimentally feasible conditions to see these effects are discussed. Comment: 26 pages and 17 PostScript figures, two large ones can be obtained from the Author

    Unsupervised Monocular Depth Estimation for Night-time Images using Adversarial Domain Feature Adaptation

    In this paper, we look into the problem of estimating per-pixel depth maps from unconstrained RGB monocular night-time images, which is a difficult task that has not been addressed adequately in the literature. The state-of-the-art day-time depth estimation methods fail miserably when tested with night-time images due to a large domain shift between them. The usual photometric losses used for training these networks may not work for night-time images due to the absence of the uniform lighting that is commonly present in day-time images, making it a difficult problem to solve. We propose to solve this problem by posing it as a domain adaptation problem where a network trained with day-time images is adapted to work for night-time images. Specifically, an encoder is trained to generate features from night-time images that are indistinguishable from those obtained from day-time images by using a PatchGAN-based adversarial discriminative learning method. Unlike the existing methods that directly adapt depth prediction (network output), we propose to adapt feature maps obtained from the encoder network so that a pre-trained day-time depth decoder can be directly used for predicting depth from these adapted features. Hence, the resulting method is termed "Adversarial Domain Feature Adaptation (ADFA)" and its efficacy is demonstrated through experimentation on the challenging Oxford night driving dataset. Also, the modular encoder-decoder architecture of the proposed ADFA method allows us to use the encoder module as a feature extractor which can be used in many other applications. One such application is demonstrated where the features obtained from our adapted encoder network are shown to outperform other state-of-the-art methods in a visual place recognition problem, thereby further establishing the usefulness and effectiveness of the proposed approach. Comment: ECCV 202

    Using Fluorescence Recovery After Photobleaching (FRAP) to study dynamics of the Structural Maintenance of Chromosome (SMC) complex in vivo

    The SMC complex, MukBEF, is important for chromosome organization and segregation in Escherichia coli. Fluorescently tagged MukBEF forms distinct spots (or 'foci') in the cell, where it is thought to carry out most of its chromosome-associated activities. This chapter outlines the technique of Fluorescence Recovery After Photobleaching (FRAP) as a method to study the properties of YFP-tagged MukB in fluorescent foci. This method can provide important insight into the dynamics of MukB on DNA and be used to study its biochemical properties in vivo.

    Inner Space Preserving Generative Pose Machine

    Image-based generative methods, such as generative adversarial networks (GANs), have already been able to generate realistic images with much context control, especially when they are conditioned. However, most successful frameworks share a common procedure which performs an image-to-image translation with the pose of figures in the image untouched. When the objective is reposing a figure in an image while preserving the rest of the image, the state of the art mainly assumes a single rigid body with a simple background and limited pose shift, which can hardly be extended to images under normal settings. In this paper, we introduce an image "inner space" preserving model that assigns an interpretable low-dimensional pose descriptor (LDPD) to an articulated figure in the image. Figure reposing is then generated by passing the LDPD and the original image through multi-stage augmented hourglass networks in a conditional GAN structure, called inner space preserving generative pose machine (ISP-GPM). We evaluated ISP-GPM on reposing human figures, which are highly articulated with versatile variations. Testing a state-of-the-art pose estimator on our reposed dataset gave an accuracy of over 80% on the PCK0.5 metric. The results also showed that our ISP-GPM is able to preserve the background with high accuracy while reasonably recovering the area blocked by the figure to be reposed. Comment: http://www.northeastern.edu/ostadabbas/2018/07/23/inner-space-preserving-generative-pose-machine
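
For readers unfamiliar with the PCK0.5 metric cited above: PCK (percentage of correct keypoints) counts a predicted keypoint as correct when it lies within a fraction `alpha` of a reference length from the ground-truth keypoint. The abstract does not state which reference length is used (torso size, head size, and bounding-box size are common choices), so `ref_length` below is an assumption, and the function name `pck` is hypothetical.

```python
import math


def pck(pred, gt, ref_length, alpha=0.5):
    """Fraction of predicted keypoints within alpha * ref_length of ground truth.

    pred, gt: equally long lists of (x, y) keypoint coordinates.
    ref_length: normalising length for the figure (assumed given).
    """
    threshold = alpha * ref_length
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return correct / len(gt)
```

For example, with `ref_length=10` and `alpha=0.5`, a keypoint predicted 5 pixels from its ground-truth location still counts as correct, while one 14 pixels away does not.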

    Simpler Statistically Sender Private Oblivious Transfer from Ideals of Cyclotomic Integers

    We present a two-message oblivious transfer protocol achieving statistical sender privacy and computational receiver privacy based on the RLWE assumption for cyclotomic number fields. This work improves upon prior lattice-based statistically sender-private oblivious transfer protocols by reducing the total communication between parties by a factor $O(n\log q)$ for transfer of length $O(n)$ messages. Prior work of Brakerski and Döttling uses transference theorems to show that either a lattice or its dual must have short vectors, the existence of which guarantees lossy encryption for encodings with respect to that lattice, and therefore statistical sender privacy. In the case of ideal lattices from embeddings of cyclotomic integers, the existence of one short vector implies the existence of many, and therefore encryption with respect to either a lattice or its dual is guaranteed to lose more information about the message than can be ensured in the case of general lattices. This additional structure of ideals of cyclotomic integers allows for efficiency improvements beyond those that are typical when moving from the generic to the ideal lattice setting, resulting in smaller message sizes for sender and receiver, as well as a protocol that is simpler to describe and analyze.